Instance-based Schema Matching for Web Databases by Domain-specific Query Probing

Authors

  • Jiying Wang
  • Ji-Rong Wen
  • Frederick H. Lochovsky
  • Wei-Ying Ma
Abstract

A Web database that dynamically provides information in response to user queries presents two distinct schemas to users: the interface schema (the schema users can query) and the result schema (the schema users can browse). Each partially reflects the actual schema of the Web database. Most previous work studied only the problem of schema matching across the query interfaces of Web databases. In this paper, we propose a novel schema model that distinguishes the interface schema from the result schema of a Web database in a specific domain. In this model, we address two significant Web database schema-matching problems: intra-site and inter-site. The first problem is crucial in automatically extracting data from Web databases, while the second plays a significant role in meta-retrieving and integrating data from different Web databases. We also investigate a unified solution to the two problems based on query probing and instance-based schema matching techniques. Using the model, a cross-validation technique is also proposed to improve the accuracy of the schema matching. Our experiments on real Web databases demonstrate that the two problems can be solved simultaneously with high precision and recall.
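
The abstract does not spell out the matching procedure, so the following is only a minimal sketch of the general idea behind query probing combined with instance-based matching, not the authors' algorithm. It assumes a hypothetical `query_fn(attribute, value)` that submits a probe value through one interface attribute and returns result rows as dictionaries; the toy data, the column names `col_0` to `col_2`, and the substring test are all illustrative assumptions. An interface attribute is matched to the result column in which its probe values reappear most often.

```python
from collections import Counter

def match_interface_to_result(probe_values, query_fn, attribute):
    """Guess which result column corresponds to one interface attribute.

    Each probe value is submitted through `attribute`; every result cell
    that contains the probe value counts as one vote for its column.
    Returns the column with the most votes, or None if nothing matched.
    """
    hits = Counter()
    for value in probe_values:
        for row in query_fn(attribute, value):
            for column, cell in row.items():
                if value.lower() in str(cell).lower():
                    hits[column] += 1
    if not hits:
        return None
    return hits.most_common(1)[0][0]

# Hypothetical stand-in for a Web database: result columns are unlabeled.
BOOKS = [
    {"col_0": "Database Systems", "col_1": "Garcia-Molina", "col_2": "Prentice Hall"},
    {"col_0": "Data Mining", "col_1": "Han", "col_2": "Morgan Kaufmann"},
    {"col_0": "Web Data Management", "col_1": "Abiteboul", "col_2": "Cambridge"},
]

def toy_query(attribute, value):
    # Assumed hidden mapping from interface fields to stored columns;
    # the matcher never sees it and must recover it from the instances.
    hidden = {"Title": "col_0", "Author": "col_1"}
    column = hidden[attribute]
    return [row for row in BOOKS if value.lower() in row[column].lower()]

if __name__ == "__main__":
    print(match_interface_to_result(["Data", "Web"], toy_query, "Title"))
    print(match_interface_to_result(["Han"], toy_query, "Author"))
```

A real system would have to parse result pages into rows and columns first and would need a more robust similarity measure than substring containment; both are left out here to keep the sketch short.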

Similar resources

Instance-Based OWL Schema Matching

Schema matching is a fundamental issue in many database applications, such as query mediation and data warehousing. It becomes a challenge when different vocabularies are used to refer to the same real-world concepts. In this context, a convenient approach, sometimes called extensional, instance-based or semantic, is to detect how the same real-world objects are represented in different database...

Holistic Schema Matching for Web Query Interface

One significant part of today’s Web is Web databases, which can dynamically provide information in response to user queries. To help users submit queries to and collect query results from different Web databases, the query interface matching problem needs to be addressed. To solve this problem, we propose a new complex schema matching approach, Holistic Schema Matching (HSM). By examining the q...

Light-weight Domain-based Form Assistant: Querying Databases on the Web

The Web has been rapidly “deepened” by myriad searchable databases online, where data are hidden behind query forms. Helping users query alternative “deep Web” sources in the same domain (e.g., Books, Airfares) is an important task with broad applications. As a core component of those applications, dynamic query translation (i.e., translating a user’s query across dynamically selected sources) ...

Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology

Due to the growth of the Web, there are many challenges in establishing a general framework for data mining and retrieving structured data from the Web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...

Light-weight Domain-based Form Assistant: Querying Web Databases On the Fly

The Web has been rapidly “deepened” by myriad searchable databases online, where data are hidden behind query forms. Helping users query alternative “deep Web” sources in the same domain (e.g., Books, Airfares) is an important task with broad applications. As a core component of those applications, dynamic query translation (i.e., translating a user’s query across dynamically selected sources) ...

Publication date: 2004